EXPLICIT REPRESENTATION OF FINITE PREDICTOR
COEFFICIENTS AND ITS APPLICATIONS

By Akihiko Inoue and Yukio Kasahara 1 

Hokkaido University 

We consider the finite-past predictor coefficients of stationary
time series, and establish an explicit representation for them, in terms
of the MA and AR coefficients. The proof is based on the alternate
applications of projection operators associated with the infinite past
and the infinite future. Applying the result to long memory processes,
we give the rate of convergence of the finite predictor coefficients and
prove an inequality of Baxter-type.

1. Introduction. Let $\{X_k\} = \{X_k : k \in \mathbb{Z}\}$ be a real, zero-mean, weakly
stationary process defined on a probability space $(\Omega, \mathcal{F}, P)$, which we shall
simply call a stationary process. We denote by $H$ the real Hilbert space
spanned by $\{X_k : k \in \mathbb{Z}\}$ in $L^2(\Omega, \mathcal{F}, P)$. The norm of $H$ is given by
$\|Y\| := E[Y^2]^{1/2}$. For $n \in \mathbb{N}$, we denote by $H_{[-n,-1]}$ and $H_{(-\infty,-1]}$ the subspaces
of $H$ spanned by $\{X_{-n}, \ldots, X_{-1}\}$ and $\{X_k : k \le -1\}$, respectively. We write
$P_{[-n,-1]}$ and $P_{(-\infty,-1]}$ for the orthogonal projection operators of $H$ onto
$H_{[-n,-1]}$ and $H_{(-\infty,-1]}$, respectively. The projection $P_{[-n,-1]}X_0$ (resp.,
$P_{(-\infty,-1]}X_0$) stands for the best linear predictor of the future value $X_0$ based
on the finite past $\{X_{-n}, \ldots, X_{-1}\}$ (resp. the infinite past $\{X_k : k \le -1\}$), and
its mean square prediction error is given by $\sigma_n^2 := \|X_0 - P_{[-n,-1]}X_0\|^2$ (resp.
$\sigma_\infty^2 := \|X_0 - P_{(-\infty,-1]}X_0\|^2$).

For nondeterministic $\{X_k\}$ (see Section 2.1), the finite predictor coefficients
$\phi_{n,j}$ are the uniquely determined ones in

(1.1) $P_{[-n,-1]}X_0 = \sum_{j=1}^{n} \phi_{n,j} X_{-j}.$

As is well known, we can calculate the numerical values of $\phi_{n,1}, \ldots, \phi_{n,n}$, as
well as the mean square prediction error $\sigma_n^2$, from the values $\gamma(0), \ldots, \gamma(n)$
of the autocovariance function of $\{X_k\}$, using recursive algorithms such as
the Durbin-Levinson algorithm (see, e.g., Section 5.2 in [6]). The recursive
methods are of great practical importance in time series analysis. However,
they are not necessarily effective in problems of a theoretical character, in
particular, those related to the asymptotic behavior as $n \to \infty$.
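The recursion referred to above can be sketched in a few lines. This is a minimal illustration and not part of the paper: the function name `durbin_levinson` and the AR(1) test values are our own choices, and the autocovariances $\gamma(k) = r^{k}/(1-r^2)$ used to exercise it are those of Example 2.4.

```python
def durbin_levinson(gamma, n):
    """Return ([phi_{n,1}, ..., phi_{n,n}], sigma_n^2) from gamma[0..n].

    gamma[k] is the autocovariance at lag k (illustrative input format).
    """
    phi = []          # phi_{m,1}, ..., phi_{m,m} for the current order m
    mse = gamma[0]    # sigma_0^2 = gamma(0)
    for m in range(1, n + 1):
        # phi_{m,m}: the partial autocorrelation at lag m
        kappa = (gamma[m] - sum(phi[j] * gamma[m - 1 - j] for j in range(m - 1))) / mse
        # phi_{m,j} = phi_{m-1,j} - phi_{m,m} * phi_{m-1,m-j}
        phi = [phi[j] - kappa * phi[m - 2 - j] for j in range(m - 1)] + [kappa]
        mse *= 1.0 - kappa * kappa
    return phi, mse


r = 0.6
gamma = [r ** k / (1.0 - r * r) for k in range(6)]   # AR(1) autocovariances
phi, mse = durbin_levinson(gamma, 5)
print(phi, mse)
```

For an AR(1) process the one-step predictor uses only $X_{-1}$, so one expects $\phi_{5,1} = r$, the remaining coefficients to vanish, and $\sigma_5^2 = 1$ up to rounding.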

A classical problem of this type is the rate of convergence of $\sigma_n^2 - \sigma_\infty^2$ as
$n \to \infty$. See, for example, [8], where references to earlier work (by Grenander
and Rosenblatt, Grenander and Szegő, Baxter, Ibragimov and many
others) are given. The arguments in these references are closely related to
the theory of orthogonal polynomials as described in [10, 26, 27].

A new approach to a related problem was introduced by Inoue [15]. For the
partial autocorrelation coefficients $\alpha(n) = \phi_{n,n}$ of a stationary process $\{X_k\}$
with short or long memory, the asymptotic behavior of $|\alpha(n)|$ as $n \to \infty$ was
obtained using a representation of the mean square prediction error $\sigma_n^2$ in
terms of the MA (moving-average) coefficients $c_k$ and the AR (autoregressive)
coefficients $a_k$ (see Section 2.2 for the definitions of $c_k$ and $a_k$). By the
same approach, but with extra complication, similar results on $|\alpha(n)|$ were
obtained in [17, 18] for the fractional ARIMA (autoregressive integrated
moving-average) processes. The fractional ARIMA model is an important
parametric model including a class of long memory processes. It was
independently introduced by Granger and Joyeux [9] and Hosking [13] (see
Example 2.6). The advantage of such an approach, that is, that via $c_k$ and
$a_k$, has become more apparent in [16], where a representation of the partial
autocorrelation function $\alpha(\cdot)$ itself, in terms of $c_k$ and $a_k$, was derived. The
representation enabled us to study the behavior of $\alpha(\cdot)$ more directly, and
thereby to improve results in several ways. In particular, the asymptotic
behavior of $\alpha(n)$ as $n \to \infty$, rather than that of $|\alpha(n)|$, was obtained.

In this paper our main interest is in the finite predictor coefficients $\phi_{n,j}$,
which are among the most basic quantities in the prediction theory for $\{X_k\}$.
After we establish an explicit representation of the type above for $\phi_{n,j}$, that
is, that in terms of the MA coefficients $c_k$ and the AR coefficients $a_k$, we
provide two applications of the representation to long memory processes.

For $n \in \mathbb{N}$, we write $H_{[-n,\infty)}$ for the subspace of $H$ spanned by $\{X_k : k \ge -n\}$
and $P_{[-n,\infty)}$ for the orthogonal projection operator of $H$ onto $H_{[-n,\infty)}$.
To prove the representation of $\phi_{n,j}$, we use an approximation scheme based
on the alternate applications of the projections $P_{(-\infty,-1]}$ and $P_{[-n,\infty)}$. In so
doing, the following equalities play a key role:

(1.2) $H_{(-\infty,-1]} \cap H_{[-n,\infty)} = H_{[-n,-1]}, \qquad n = 1, 2, \ldots$

(see Theorem 2.2). For example, it is known that (1.2) holds if $\{X_n\}$ is
purely nondeterministic and has spectral density $\Delta(\cdot)$ such that

(1.3) $\int_{-\pi}^{\pi} \frac{1}{\Delta(\lambda)}\,d\lambda < \infty$

(Theorem 3.1 in [15]). We discuss the equivalence between (1.2) and complete
nondeterminism (see Theorem 2.3 and Remark 2).

When Wiener's prediction formula (1.4) below is available, we thus obtain
a representation of $\phi_{n,j}$ in terms of $c_k$ and $a_k$ (Theorem 2.5). However,
in applications it is essential that $\phi_{n,j}$ be expressed in terms of absolutely
convergent series made up of $c_k$ and $a_k$. We derive such an expression
(Theorem 2.9) under additional conditions on $c_k$ and $a_k$, that is, (A1) or (A2)
in Section 2.3. The condition (A1) corresponds to short memory processes,
and (A2) to long memory processes.

The first application of the representation of $\phi_{n,j}$ concerns the rate of
convergence of $\phi_{n,j}$ toward its limit as $n \to \infty$. Under suitable conditions,
$\phi_{n,j}$ converges to the infinite predictor coefficient $\phi_j$ in

(1.4) $P_{(-\infty,-1]}X_0 = \sum_{j=1}^{\infty} \phi_j X_{-j}.$

The rate at which $\phi_{n,j}$ converges to $\phi_j$ is a fundamental problem in prediction
theory and time series analysis. A textbook treatment of this problem
can be found in [22], Section 7.6. Using the representation of $\phi_{n,j}$, we show
that

$\phi_{n,j} - \phi_j = \sum_{k=2}^{\infty} g_k(n,j),$

where $g_k(n,j)$ is a function of $\{c_k\}$ and $\{a_k\}$ [see (2.28)], and we examine the
convergence rate for a long memory process whose autocovariance function
$\gamma(\cdot)$ is regularly varying at infinity with index $-p$ for some $p \in (0,1)$. It is
shown that $\lim_{n\to\infty} n\{\phi_{n,j} - \phi_j\}$ exists, and the limit is calculated exactly
in terms of $p$ and $\{\phi_k\}$ (Theorem 3.3). It is interesting that the rate of
convergence does not depend on $p$.

The second application of the representation of $\phi_{n,j}$ is related to the
additional error $\|P_{[-n,-1]}X_0 - \sum_{j=1}^{n} \phi_j X_{-j}\|$ that arises when we use the
infinite predictor coefficients $\phi_j$ instead of the finite ones $\phi_{n,j}$. There exists
a known inequality that deals with this problem, and is commonly referred
to as Baxter's inequality (see [1]; see also [3, 7] and Section 7.6.2 in [22]). It
takes the form

(1.5) $\sum_{j=1}^{n} |\phi_{n,j} - \phi_j| \le M \sum_{k=n+1}^{\infty} |\phi_k|,$

with finite positive constant $M$. The original inequality (1.5) of Baxter was
an assertion for short memory processes. By simple arguments based on the
representation of $\phi_{n,j}$, we prove (1.5) for long memory processes, including
the fractional ARIMA processes (Theorem 4.1).

In Section 2 we prove the representation of the finite predictor coefficients
$\phi_{n,j}$. In Section 3 we apply it to show the rate of convergence of $\phi_{n,j}$ for
long memory processes. In Section 4 we apply the representation to prove
an inequality of Baxter-type for long memory processes.

2. Finite predictor coefficients. Let $\{X_n\} = \{X_n : n \in \mathbb{Z}\}$ be a stationary
process; as stated in Section 1, this means that $\{X_n\}$ is a real, zero-mean,
weakly stationary process defined on a probability space $(\Omega, \mathcal{F}, P)$. The
autocovariance function of $\{X_n\}$ is defined by

$\gamma(n) := E[X_n X_0], \qquad n \in \mathbb{Z}.$

As we also stated in Section 1, we denote by $H$ the closed real linear hull
of $\{X_k : k \in \mathbb{Z}\}$ with respect to the norm $\|Y\| := E[Y^2]^{1/2}$. Then $H$ is a real
Hilbert space with inner product $(Y, Z) := E[YZ]$. For $n, m \in \mathbb{Z}$ with $n \le m$,
we write $H_{(-\infty,n]}$, $H_{[n,\infty)}$, $H_{[n,m]}$ and $H_{\{n\}}$ for the closed subspaces of $H$
spanned by $\{X_k : -\infty < k \le n\}$, $\{X_k : n \le k < \infty\}$, $\{X_k : n \le k \le m\}$ and $X_n$,
respectively. Notice that $H_{\{n\}} = H_{[n,n]}$. For an interval $I$, we write $P_I$ for
the orthogonal projection operator of $H$ onto $H_I$.

A stationary process $\{X_n\}$ is said to be purely nondeterministic (PND)
if

$\bigcap_{n=-\infty}^{\infty} H_{(-\infty,n]} = \{0\}.$

If there exists an even, nonnegative and integrable function $\Delta(\cdot)$ on $[-\pi, \pi]$
such that

$\gamma(n) = \int_{-\pi}^{\pi} e^{in\lambda} \Delta(\lambda)\,d\lambda, \qquad n \in \mathbb{Z},$

then $\Delta(\cdot)$ is called a spectral density of $\{X_n\}$. As is well known, $\{X_n\}$ is PND
if and only if it has a positive spectral density such that $\int_{-\pi}^{\pi} |\log \Delta(\lambda)|\,d\lambda < \infty$
(see, e.g., Chapter II in [23]).

2.1. Convergence of an approximation scheme. Let $Y \in H$. If $\{X_k\}$ is
nondeterministic, that is, $X_0 \notin H_{(-\infty,-1]}$, then $X_{-n}, \ldots, X_{-1}$ are linearly
independent, whence we can express the predictor $P_{[-n,-1]}Y$ uniquely in the
form

(2.1) $P_{[-n,-1]}Y = \sum_{j=1}^{n} \phi_{n,j}(Y) X_{-j}.$

In this section we prove the convergence of an approximation scheme for
computing the real coefficients $\phi_{n,j}(Y)$.

For $n, k \in \mathbb{N}$, we define the orthogonal projection operator $P_n^k$ by

(2.2) $P_n^k := \begin{cases} P_{(-\infty,-1]}, & k = 1, 3, 5, \ldots, \\ P_{[-n,\infty)}, & k = 2, 4, 6, \ldots. \end{cases}$

It should be noticed that $\{P_n^k : k = 1, 2, \ldots\}$ is merely an alternating sequence
of projection operators, first to the subspace $H_{(-\infty,-1]}$, then to $H_{[-n,\infty)}$, and
so on.
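The alternating scheme can be illustrated in finite dimensions. The following sketch is only an analogy and not taken from the paper: the two subspaces of $\mathbb{R}^3$ are hypothetical stand-ins for $H_{(-\infty,-1]}$ and $H_{[-n,\infty)}$, chosen so that their intersection is the line spanned by the second basis vector.

```python
def project(basis, x):
    """Orthogonal projection of x onto span(basis); basis must be orthonormal."""
    out = [0.0] * len(x)
    for b in basis:
        coef = sum(bi * xi for bi, xi in zip(b, x))
        out = [oi + coef * bi for oi, bi in zip(out, b)]
    return out


s = 2 ** -0.5
A = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]   # stand-in for H_(-inf,-1]
B = [[0.0, 1.0, 0.0], [s, 0.0, s]]       # stand-in for H_[-n,inf)
y = [3.0, -2.0, 5.0]

x = y
for k in range(1, 61):                   # P^k ... P^2 P^1 y, alternating A, B
    x = project(A if k % 2 == 1 else B, x)
print(x)
```

In this toy case the iterates approach the projection of `y` onto the intersection of the two subspaces, which is the finite-dimensional counterpart of the limit statement used in the proof of Theorem 2.2.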

Lemma 2.1. Assume that $\{X_n\}$ is nondeterministic. Let $Y$ be an arbitrary
element of $H$. Then, for $n, k \in \mathbb{N}$, there exist unique real coefficients
$\phi^k_{n,j}(Y)$, $j = 1, \ldots, n$, as well as $Z_n^k \in H_{(-\infty,-n-1]}$ for $k$ odd and
$Z_n^k \in H_{[0,\infty)}$ for $k$ even, such that

$P_n^k P_n^{k-1} \cdots P_n^1 Y = \sum_{j=1}^{n} \phi^k_{n,j}(Y) X_{-j} + Z_n^k.$

Proof. We assume that $k$ is odd. From Lemma 6.1 in [22] (Regression
Lemma), it follows that

(2.3) $H_{(-\infty,-1]} = H_{(-\infty,-n-1]} + H_{[-n,-1]}$ (direct sum)

(see the proof of Theorem 6.3 in [22]). Since $X_{-n}, \ldots, X_{-1}$ are linearly
independent and $P_n^k P_n^{k-1} \cdots P_n^1 Y \in H_{(-\infty,-1]}$, the lemma for $k$ odd follows. The
case in which $k$ is even is proved in a similar fashion. □

It is natural to ask if $\phi^k_{n,j}(Y)$ converges to $\phi_{n,j}(Y)$ as $k \to \infty$.

Theorem 2.2. We assume that

(2.4) $\{X_n\}$ is nondeterministic and satisfies (1.2).

Then we have

(2.5) $\phi_{n,j}(Y) = \lim_{k\to\infty} \phi^k_{n,j}(Y), \qquad Y \in H,\ n \in \mathbb{N},\ j = 1, \ldots, n.$

In particular, (2.5) holds if

(2.6) $\{X_n\}$ is purely nondeterministic and satisfies (1.3).

Proof. The condition (1.2) and von Neumann's alternating projection
theorem (see, e.g., Theorem 9.20 in [22]) yield

(2.7) $\operatorname{s-lim}_{k\to\infty} P_n^k P_n^{k-1} \cdots P_n^1 = P_{[-n,-1]}, \qquad n = 1, 2, \ldots.$

We put

(2.8) $\varepsilon_k := X_k - P_{(-\infty,k-1]}X_k \qquad (k \in \mathbb{Z}).$

Then, from Lemma 2.1 we see that

$(P_n^{2k+1} P_n^{2k} \cdots P_n^1 Y, \varepsilon_{-1}) = \phi^{2k+1}_{n,1}(Y) \cdot \|\varepsilon_{-1}\|^2.$

By (2.7), the left-hand side tends to $(P_{[-n,-1]}Y, \varepsilon_{-1})$ as $k \to \infty$. Thus,
$\alpha_{n,1} := \lim_{k\to\infty} \phi^{2k+1}_{n,1}(Y)$ exists. In the same way, letting $k \to \infty$ in

$(P_n^{2k+1} P_n^{2k} \cdots P_n^1 Y, \varepsilon_{-2}) = \phi^{2k+1}_{n,2}(Y) \cdot \|\varepsilon_{-2}\|^2 + \phi^{2k+1}_{n,1}(Y) \cdot (X_{-1}, \varepsilon_{-2}),$

we find the existence of $\alpha_{n,2} := \lim_{k\to\infty} \phi^{2k+1}_{n,2}(Y)$. Repeating this argument,
we see that $\alpha_{n,j} := \lim_{k\to\infty} \phi^{2k+1}_{n,j}(Y)$ exists for all $j = 1, \ldots, n$. Hence,
$Z_n := \lim_{k\to\infty} Z_n^{2k+1}$ also exists in $H$ and we have

$Z_n = P_{[-n,-1]}Y - \sum_{j=1}^{n} \alpha_{n,j} X_{-j}.$

Since the right-hand side is in $H_{[-n,-1]}$, so is $Z_n$. Moreover, $Z_n \in H_{(-\infty,-n-1]}$
since, for every $k \ge 1$, $Z_n^{2k+1}$ belongs to the closed subspace $H_{(-\infty,-n-1]}$.
Combining, $Z_n \in H_{(-\infty,-n-1]} \cap H_{[-n,-1]}$. However, by (2.3) this implies $Z_n = 0$.
Thus, $P_{[-n,-1]}Y = \sum_{j=1}^{n} \alpha_{n,j} X_{-j}$. By uniqueness, we obtain $\phi_{n,j}(Y) =
\alpha_{n,j} = \lim_{k\to\infty} \phi^{2k+1}_{n,j}(Y)$. Similarly, we have $\phi_{n,j}(Y) = \lim_{k\to\infty} \phi^{2k}_{n,j}(Y)$. Thus,
(2.5) follows. Finally, by Theorem 3.1 in [15], (2.6) implies (2.4), whence
(2.5). □

Remark 1. A stationary process $\{X_n\}$ is said to be minimal if $X_0$ does
not belong to the closed linear span of $\{X_k : k \in \mathbb{Z},\ k \ne 0\}$ in $H$. By Theorem
24 in [20], (2.6) is equivalent to saying that $\{X_n\}$ is purely nondeterministic
and minimal. The condition (2.6) is also equivalent to another property
called pure minimality (see [21, 24] and Theorem 8.10 in [22]). The condition
(2.6) holds in most interesting examples, and we can easily check it.

Since the assumption (2.4) is a key to our arguments, we are interested 
in its characterization. The next theorem gives such a result. 

Theorem 2.3. The condition (2.4) is equivalent to

(2.9) $H_{(-\infty,-1]} \cap H_{[0,\infty)} = \{0\}.$

Proof. First we assume (2.4). Then

$H_{(-\infty,-1]} \cap H_{[0,\infty)} \subset H_{(-\infty,-1]} \cap H_{[-1,\infty)} = H_{\{-1\}},$

$H_{(-\infty,-1]} \cap H_{[0,\infty)} \subset H_{(-\infty,0]} \cap H_{[0,\infty)} = H_{\{0\}},$

while $X_{-1}$ and $X_0$ are linearly independent since $\{X_n\}$ is nondeterministic.
Thus, (2.9) follows.

Next we assume (2.9). By the arguments in [12], page 6, we see that
$\{X_n\}$ is PND, whence, in particular, nondeterministic. Let $n \in \mathbb{N}$ and
$X \in H_{(-\infty,-1]} \cap H_{[-n,\infty)}$. By the Regression Lemma, $X$ has a decomposition
$X = Y + Z$ with $Y \in H_{(-\infty,-n-1]}$ and $Z \in H_{[-n,-1]}$. Then $Y = X - Z \in
H_{(-\infty,-n-1]} \cap H_{[-n,\infty)}$. However, (2.9) implies $H_{(-\infty,-n-1]} \cap H_{[-n,\infty)} = \{0\}$,
so that $Y = 0$ or $X = Z \in H_{[-n,-1]}$. Thus, $H_{(-\infty,-1]} \cap H_{[-n,\infty)} \subset H_{[-n,-1]}$.
Since the converse inclusion $\supset$ is trivial, we obtain (1.2), whence (2.4).
□

A stationary process $\{X_n\}$ is said to be completely nondeterministic if (2.9)
holds. Thus, Theorem 2.3 asserts the equivalence between (2.4) and the
complete nondeterminism of $\{X_n\}$. Complete nondeterminism was introduced
by Sarason [25].

Remark 2. In the first version of this manuscript, we raised the
characterization of (2.4) in terms of the spectral density $\Delta(\cdot)$ as an open problem
after remarking that (2.4) implies (2.9), whence that $\{X_n\}$ is PND. In the
summer of 2004, Mohsen Pourahmadi, and then an anonymous referee,
suggested the equivalence between (2.4) and (2.9), and both cited Bloomfield,
Jewell and Hayashi [5], in which several characterizations of complete
nondeterminism (2.9), in terms of the outer function determined by $\Delta(\cdot)$, which
is essentially the same as $h(z)$ in (2.11) below, are given. Thus, we owe much
of Theorem 2.3 to them.

2.2. Representation in terms of MA and AR coefficients. In this section
we assume that the stationary process $\{X_n\}$ is purely nondeterministic.

For $n \in \mathbb{N}$ and $m \in \mathbb{N} \cup \{0\}$, we can express the $(m+1)$-step predictor
$P_{[-n,-1]}X_m$ uniquely in the form

(2.10) $P_{[-n,-1]}X_m = \sum_{j=1}^{n} \phi^m_{n,j} X_{-j}.$

We are concerned with representation of the real coefficients $\phi^m_{n,j}$, which we
call the $(m+1)$-step finite predictor coefficients. In the 1-step case $m = 0$,
we have $\phi^0_{n,j} = \phi_{n,j}$ by (1.1).
We consider the outer function

(2.11) $h(z) := \sqrt{2\pi}\,\exp\Bigl\{\frac{1}{4\pi} \int_{-\pi}^{\pi} \frac{e^{i\lambda} + z}{e^{i\lambda} - z} \log \Delta(\lambda)\,d\lambda\Bigr\}, \qquad z \in \mathbb{C},\ |z| < 1.$

The function $h(z)$ is holomorphic and has no zeros in $|z| < 1$, and it satisfies
$2\pi\Delta(\lambda) = |h(e^{i\lambda})|^2$ a.e., where $h(e^{i\lambda}) := \lim_{r \uparrow 1} h(re^{i\lambda})$. We define the MA
coefficients $c_n$ by

$h(z) = \sum_{n=0}^{\infty} c_n z^n, \qquad |z| < 1,$

and the AR coefficients $a_n$ by

$-1/h(z) = \sum_{n=0}^{\infty} a_n z^n, \qquad |z| < 1$

(see Section 2 in [15]). Both $\{c_n\}$ and $\{a_n\}$ are real sequences, and we have
$c_0 > 0$ and $\sum_{n=0}^{\infty} (c_n)^2 < \infty$. The coefficients $c_n$ and $a_n$ are actually those that
appear in the following MA($\infty$) and AR($\infty$) representations, respectively,
of $\{X_n\}$ [under suitable condition such as (2.15) below for the latter]:

(2.12) $X_n = \sum_{j=-\infty}^{n} c_{n-j}\,\xi_j, \qquad n \in \mathbb{Z},$

(2.13) $\sum_{j=-\infty}^{n} a_{n-j} X_j + \xi_n = 0, \qquad n \in \mathbb{Z},$

where $\{\xi_k\}$ is the innovation process given by $\xi_k = \varepsilon_k / \|\varepsilon_k\|$ with $\varepsilon_k$ in (2.8);
see, for example, Chapter II in [23] for (2.12), and (4.9) in [15] for (2.13).
By the assumption that $\{X_k\}$ is PND, $\{\xi_k\}$ forms a complete orthonormal
system of $H$ such that, for every $n \in \mathbb{Z}$, the closed linear span of
$\{\xi_k : -\infty < k \le n\}$ in $H$ is equal to $H_{(-\infty,n]}$. Notice that the sums in (2.13) may not
converge in norm in $H$.

Example 2.4. Let $r \in (-1, 1)$. We consider the unique causal solution
$X_n = \sum_{j=-\infty}^{n} r^{n-j} e_j$ to the AR(1) equation $X_n = rX_{n-1} + e_n$, where
$\{e_n : n \in \mathbb{Z}\}$ is white noise, that is, a sequence in $H$ such that $(e_n, e_m) = \delta_{nm}$ (see,
e.g., Section 4.1.1 in [22]). By standard computations, we find the equalities

$\xi_n = e_n, \qquad \gamma(n) = \frac{r^{|n|}}{1 - r^2}, \qquad \Delta(\lambda) = \frac{1}{2\pi} \frac{1}{|1 - re^{i\lambda}|^2}, \qquad h(z) = \frac{1}{1 - rz},$

$c_n = r^n \ (n \ge 0), \qquad a_0 = -1, \qquad a_1 = r, \qquad a_n = 0 \ (n \ge 2).$
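The passage from the MA coefficients to the AR coefficients is a power-series inversion: since $h(z)\sum_n a_n z^n = -1$, one has $c_0 a_0 = -1$ and $\sum_{k=0}^{n} c_k a_{n-k} = 0$ for $n \ge 1$. The sketch below carries this out numerically; the helper name `ar_from_ma` is ours, and the AR(1) values of Example 2.4 serve as a check.

```python
def ar_from_ma(c, n_terms):
    """AR coefficients a_0..a_{n_terms-1} with -1/h(z) = sum a_n z^n,
    given the leading MA coefficients c (coefficients beyond len(c) are
    treated as zero, which is an assumption of this sketch)."""
    a = [-1.0 / c[0]]
    for n in range(1, n_terms):
        # sum_{k=0}^{n} c_k a_{n-k} = 0  =>  a_n = -(1/c_0) sum_{k>=1} c_k a_{n-k}
        s = sum(c[k] * a[n - k] for k in range(1, min(n, len(c) - 1) + 1))
        a.append(-s / c[0])
    return a


r = 0.6
c = [r ** n for n in range(50)]      # MA coefficients of the AR(1) example
a = ar_from_ma(c, 10)
print(a)
```

Up to rounding, the output should reproduce $a_0 = -1$, $a_1 = r$ and $a_n = 0$ for $n \ge 2$.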



We put

$b^m_j := \sum_{k=0}^{m} c_k a_{j+m-k}, \qquad m, j \in \mathbb{N} \cup \{0\}.$

In particular, $b^0_j = c_0 a_j$. For $n \in \mathbb{N}$ and $m, j \in \mathbb{N} \cup \{0\}$, we define $b^m_k(n, j)$
recursively by

(2.14) $b^m_1(n, j) = b^m_j, \qquad b^m_{k+1}(n, j) = \sum_{m_1=0}^{\infty} b^m_{n+1+m_1}\, b^{m_1}_k(n, j), \qquad k = 1, 2, \ldots.$

From the proof of Theorem 2.5 below, we see that, under the condition

(2.15) $\sum_{n=0}^{\infty} |a_n| < \infty,$

which ensures the absolute convergence of the sums in (2.13), the sums
in (2.14) also converge absolutely. We put, for $m \in \mathbb{N} \cup \{0\}$, $n \in \mathbb{N}$ and
$j = 1, 2, \ldots, n$,

$g^m_k(n, j) := \begin{cases} b^m_k(n, j), & k = 1, 3, \ldots, \\ b^m_k(n, n+1-j), & k = 2, 4, \ldots. \end{cases}$

We write $\sum_{k=1}^{\infty-}$ for the improper sum: $\sum_{k=1}^{\infty-} = \lim_{M\to\infty} \sum_{k=1}^{M}$. The following
theorem gives an explicit representation of the $(m+1)$-step finite
predictor coefficients $\phi^m_{n,j}$ in (2.10), in terms of the MA and AR coefficients,
under the absolute convergence of the sums in (2.13).

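Theorem 2.5. We assume that the AR coefficients $a_n$ of a purely
nondeterministic stationary process $\{X_n\}$ satisfy (2.15). Then we have
$\phi^m_{n,j} = \sum_{k=1}^{\infty-} g^m_k(n, j)$ for $n \in \mathbb{N}$, $m \in \mathbb{N} \cup \{0\}$ and $j = 1, \ldots, n$, that is,

$P_{[-n,-1]}X_m = \sum_{j=1}^{n} \Bigl\{ \sum_{k=1}^{\infty-} g^m_k(n, j) \Bigr\} X_{-j}.$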

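Proof. For $m \in \mathbb{N} \cup \{0\}$ and $n \in \mathbb{N}$, we have the Wiener prediction
formulas (see, e.g., Theorem 4.4 in [15])

(2.16) $P_{(-\infty,-1]}X_m = \sum_{j=1}^{\infty} b^m_j X_{-j},$

(2.17) $P_{[-n,\infty)}X_{-n-1-m} = \sum_{j=1}^{\infty} b^m_j X_{-n-1+j},$

the sums converging absolutely in $H$. Recall $P_n^k$ from (2.2). From (2.16), we
have

$P_n^1 X_m = \sum_{j=1}^{n} g^m_1(n, j) X_{-j} + \sum_{m_1=0}^{\infty} b^m_{n+1+m_1} X_{-n-1-m_1}.$

From this and (2.17), it follows that

$P_n^2 P_n^1 X_m = \sum_{j=1}^{n} g^m_1(n, j) X_{-j} + \sum_{m_1=0}^{\infty} b^m_{n+1+m_1} \sum_{j=1}^{\infty} b^{m_1}_j X_{-n-1+j}$

$\qquad = \sum_{j=1}^{n} \{g^m_1(n, j) + g^m_2(n, j)\} X_{-j} + \sum_{m_1=0}^{\infty} b^m_{n+1+m_1} \sum_{m_2=0}^{\infty} b^{m_1}_{n+1+m_2} X_{m_2}.$

Similarly,

$P_n^3 P_n^2 P_n^1 X_m = \sum_{j=1}^{n} \{g^m_1(n, j) + g^m_2(n, j)\} X_{-j} + \sum_{m_1=0}^{\infty} b^m_{n+1+m_1} \sum_{m_2=0}^{\infty} b^{m_1}_{n+1+m_2} \sum_{j=1}^{\infty} b^{m_2}_j X_{-j}$

$\qquad = \sum_{j=1}^{n} \{g^m_1(n, j) + g^m_2(n, j) + g^m_3(n, j)\} X_{-j} + \sum_{m_1=0}^{\infty} b^m_{n+1+m_1} \sum_{m_2=0}^{\infty} b^{m_1}_{n+1+m_2} \sum_{m_3=0}^{\infty} b^{m_2}_{n+1+m_3} X_{-n-1-m_3}.$

Repeating this argument, we see that $\phi^k_{n,j}(X_m)$ in Lemma 2.1 with $Y = X_m$
are given by $\phi^k_{n,j}(X_m) = \sum_{l=1}^{k} g^m_l(n, j)$. The condition (2.15) implies
$\sum_{n=0}^{\infty} (a_n)^2 < \infty$, whence (1.3) (see, e.g., Proposition 4.2 in [15]). Thus, the
theorem follows from Theorem 2.2. □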

2.3. Representation by absolutely convergent series. In the applications
which we discuss later, the finite predictor coefficients $\phi_{n,j}$ in (1.1) need to
be expressed by an absolutely convergent series made up of $a_k$ and $c_k$. In
this section we first give such an expression for $b^m_k(n, j)$. In the 1-step case
$m = 0$, the result yields the desired representation for $\phi_{n,j}$.

We write $\mathcal{R}_0$ for the class of slowly varying functions at infinity: the
class of positive, measurable $\ell(\cdot)$, defined on some neighborhood $[A, \infty)$ of
infinity, such that $\lim_{x\to\infty} \ell(\lambda x)/\ell(x) = 1$ for all $\lambda > 0$ (see Chapter 1 in [4]
for background).

Throughout this section we assume that the stationary process $\{X_n\}$
satisfies one of the following conditions (A1) and (A2):

(A1) $\{X_n\}$ is purely nondeterministic, and $\{a_n\}$ and $\{c_n\}$ satisfy,
respectively, (2.15) and

(2.18) $\sum_{n=0}^{\infty} |c_n| < \infty.$

(A2) $\{X_n\}$ is purely nondeterministic and, for $d \in (0, 1/2)$ and $\ell(\cdot) \in \mathcal{R}_0$,
$\{c_n\}$ and $\{a_n\}$ satisfy, respectively,

(2.19) $c_n \sim n^{-(1-d)}\,\ell(n), \qquad n \to \infty,$

(2.20) $a_n \sim n^{-(1+d)}\,\frac{1}{\ell(n)} \cdot \frac{d \sin(\pi d)}{\pi}, \qquad n \to \infty.$

It should be noticed that (2.20) implies (2.15).

In this paper we say that a stationary process $\{X_n\}$ has long memory
(resp. short memory) if $\sum_{k=-\infty}^{\infty} |\gamma(k)| = \infty$ (resp. $< \infty$). See [2], page 6,
and Section 13.2 in [6]. By (2.12), the autocovariance function $\gamma(\cdot)$ has the
expression

(2.21) $\gamma(n) = \sum_{k=0}^{\infty} c_{|n|+k}\, c_k, \qquad n \in \mathbb{Z}.$

Hence, (2.18) implies that

$\sum_{n=0}^{\infty} |\gamma(n)| \le \Bigl( \sum_{k=0}^{\infty} |c_k| \Bigr)^2 < \infty.$

Thus, $\{X_n\}$ has short memory under (A1). On the other hand, by (2.21)
and [14], Proposition 4.3, (2.19) implies that

(2.22) $\gamma(n) \sim n^{-(1-2d)}\,\ell(n)^2\,B(d, 1-2d), \qquad n \to \infty.$

Since $0 < 1 - 2d < 1$, we see that $\{X_n\}$ has long memory under (A2). We
remark that, under suitable conditions, (2.19), (2.20) and (2.22) are equivalent
(see Theorem 5.1 in [15]).

Example 2.6. For $d \in (-1/2, 1/2)$ and $p, q \in \mathbb{N} \cup \{0\}$, a stationary
process $\{X_n\}$ is said to be a fractional ARIMA($p, d, q$) process if it has a spectral
density $\Delta(\cdot)$ of the form

$\Delta(\lambda) = \frac{1}{2\pi} \frac{|\theta(e^{i\lambda})|^2}{|\phi(e^{i\lambda})|^2}\, |1 - e^{i\lambda}|^{-2d}, \qquad -\pi < \lambda \le \pi,$

where $\phi(z)$ and $\theta(z)$ are polynomials with real coefficients of degrees $p$ and
$q$, respectively. We assume that $\phi(z)$ and $\theta(z)$ have no common zeros, and
that neither $\phi(z)$ nor $\theta(z)$ has zeros in the closed unit disk $\{z \in \mathbb{C} : |z| \le 1\}$.
We also assume without loss of generality that $\theta(0)/\phi(0) > 0$. Then the outer
function $h(\cdot)$ is given by $h(z) = (1 - z)^{-d}\,\theta(z)/\phi(z)$ (see, e.g., Section 2 in
[17]). If $0 < d < 1/2$, then $\{X_n\}$ satisfies (A2) for some constant function $\ell(\cdot)$
(see Corollary 3.1 in [19]). If $d = 0$, then $\{X_n\}$ is also called an ARMA($p, q$)
process (see Chapter 3 in [6]), and both $\{c_n\}$ and $\{a_n\}$ decay exponentially,
whence (A1) is satisfied.
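For the simplest case $p = q = 0$ with $\theta = \phi = 1$, the outer function is $h(z) = (1-z)^{-d}$, and the coefficients of $(1-z)^{-d}$ and $-(1-z)^{d}$ satisfy elementary two-term recursions. The sketch below, which is our own illustration rather than part of the paper, computes them and compares with (2.19) and (2.20) taking the constant $\ell(n) = 1/\Gamma(d)$; the function name `farima_coeffs` and the values $d = 0.3$, $n = 20000$ are arbitrary choices.

```python
import math


def farima_coeffs(d, n_terms):
    """MA and AR coefficients of fractional ARIMA(0, d, 0):
    h(z) = (1 - z)^(-d) = sum c_n z^n and -1/h(z) = -(1 - z)^d = sum a_n z^n."""
    c, a = [1.0], [-1.0]
    for n in range(1, n_terms):
        c.append(c[-1] * (n - 1 + d) / n)   # binomial-series recursion for (1-z)^(-d)
        a.append(a[-1] * (n - 1 - d) / n)   # binomial-series recursion for -(1-z)^d
    return c, a


d, n = 0.3, 20000
c, a = farima_coeffs(d, n + 1)
ell = 1.0 / math.gamma(d)                   # constant slowly varying function
ratio_c = c[n] / (n ** (-(1.0 - d)) * ell)
ratio_a = a[n] / (n ** (-(1.0 + d)) / ell * d * math.sin(math.pi * d) / math.pi)
print(ratio_c, ratio_a)
```

Both ratios should be close to 1 for large $n$, which is what (A2) asserts in this special case.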

We put

$B_n := \sum_{v=0}^{\infty} |c_v a_{n+v}|, \qquad n \in \mathbb{N} \cup \{0\}.$

For $n, k, u, v \in \mathbb{N} \cup \{0\}$, we define $D_k(n, u, v)$ recursively by

$D_0(n, u, v) := \delta_{uv}, \qquad D_{k+1}(n, u, v) := \sum_{w=0}^{\infty} B_{n+v+w}\, D_k(n, u, w).$

We have, for example,

$D_3(n, u, v) = \sum_{v_1=0}^{\infty} \sum_{v_2=0}^{\infty} B_{n+v+v_1}\, B_{n+v_1+v_2}\, B_{n+v_2+u}.$

By the Fubini-Tonelli theorem, we have $D_k(n, u, v) = D_k(n, v, u)$.

Lemma 2.7. We assume either (A1) or (A2). Then, for $k, n, u \in \mathbb{N} \cup \{0\}$,

$\sum_{v=0}^{\infty} D_k(n, u, v) < \infty \quad$ and $\quad \sum_{v=0}^{\infty} D_k(n, u, v)^2 < \infty,$

respectively. In particular, we have $D_k(n, u, v) < \infty$ for $k, n, u, v \in \mathbb{N} \cup \{0\}$.

Proof. First we assume (A1). Then

$\sum_{m=0}^{\infty} B_m \le \Bigl\{ \sum_{u=0}^{\infty} |c_u| \Bigr\} \Bigl\{ \sum_{u=0}^{\infty} |a_u| \Bigr\} < \infty.$

This and the nonnegativity of $B_m$ imply, for example,

$\sum_{v=0}^{\infty} D_3(n, u, v) = \sum_{v=0}^{\infty} \sum_{v_1=0}^{\infty} \sum_{v_2=0}^{\infty} B_{n+v+v_1}\, B_{n+v_1+v_2}\, B_{n+v_2+u} \le \Bigl\{ \sum_{m=0}^{\infty} B_m \Bigr\}^3 < \infty.$

The general case can be proved in the same way.

Next we assume (A2). The proof in this case is the same as that of
Lemma 2.1 in [16]. By (A2) and Proposition 4.3 in [14], we have $B_n =
O(n^{-1})$ as $n \to \infty$. Therefore, for $n \in \mathbb{N}$, $f_v \mapsto \sum_{v=0}^{\infty} B_{n+u+v} f_v$ defines a
bounded linear operator on $l^2$ (see Chapter IX in [11]). Since $D_{k+1}(n, u, v) =
\sum_{w} B_{n+u+w}\, D_k(n, w, v)$, we obtain the desired result by induction on $k$. □

We put

(2.23) $\beta_n := \sum_{v=0}^{\infty} c_v a_{v+n}, \qquad n = 0, 1, \ldots.$

In view of Lemma 2.7, we may define $\delta_k(n, u, v)$ recursively by, for
$k, n, u, v \in \mathbb{N} \cup \{0\}$,

(2.24) $\delta_0(n, u, v) = \delta_{uv}, \qquad \delta_{k+1}(n, u, v) = \sum_{w=0}^{\infty} \beta_{n+v+w}\, \delta_k(n, u, w).$

By Lemma 2.7 and the Fubini theorem, we have $\delta_k(n, u, v) = \delta_k(n, v, u)$.

The following theorem expresses $b^m_k(n, j)$ as an absolutely convergent
series.

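Theorem 2.8. We assume either (A1) or (A2). Then, for $n, k \in \mathbb{N}$ and
$m, j \in \mathbb{N} \cup \{0\}$,

(2.25) $b^m_k(n, j) = \sum_{v=0}^{m} c_{m-v} \sum_{u=0}^{\infty} a_{j+u}\, \delta_{k-1}(n+1, u, v),$

the sum converging absolutely.

Proof. By Lemma 2.7 and (2.15), we have

(2.26) $\sum_{u=0}^{\infty} |a_{j+u}|\, D_{k-1}(n+1, u, v) \le \Bigl\{ \sup_{u} D_{k-1}(n+1, u, v) \Bigr\} \sum_{u=0}^{\infty} |a_{j+u}| < \infty.$

Thus, the right-hand side of (2.25), which we denote by $B^m_k(n, j)$, converges
absolutely. To prove the theorem, it is enough to show that $B^m_k(n, j)$ satisfies
the same recursion as (2.14).

First we have

$B^m_1(n, j) = \sum_{v=0}^{m} c_{m-v} \sum_{u=0}^{\infty} a_{j+u}\, \delta_{uv} = \sum_{v=0}^{m} c_{m-v}\, a_{j+v} = b^m_j,$

as desired. Next, the Fubini-Tonelli theorem and (2.26) yield, for $k \ge 1$,

$\sum_{u=0}^{\infty} a_{j+u}\, \delta_k(n+1, u, v) = \sum_{u=0}^{\infty} \sum_{w=0}^{\infty} a_{j+u} \Bigl\{ \sum_{m_1=w}^{\infty} c_{m_1-w}\, a_{n+1+v+m_1} \Bigr\} \delta_{k-1}(n+1, u, w)$

$\qquad = \sum_{m_1=0}^{\infty} a_{n+1+v+m_1} \sum_{w=0}^{m_1} c_{m_1-w} \sum_{u=0}^{\infty} a_{j+u}\, \delta_{k-1}(n+1, u, w) = \sum_{m_1=0}^{\infty} a_{n+1+v+m_1}\, B^{m_1}_k(n, j),$

so that

$B^m_{k+1}(n, j) = \sum_{v=0}^{m} c_{m-v} \sum_{m_1=0}^{\infty} a_{n+1+v+m_1}\, B^{m_1}_k(n, j) = \sum_{m_1=0}^{\infty} \Bigl\{ \sum_{v=0}^{m} c_{m-v}\, a_{n+1+m_1+v} \Bigr\} B^{m_1}_k(n, j) = \sum_{m_1=0}^{\infty} b^m_{n+1+m_1}\, B^{m_1}_k(n, j).$

Thus, $B^m_k(n, j)$ satisfies (2.14). □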

For applications in later sections, we consider the case $m = 0$ separately.
We put

$d_k(n, j) := \delta_k(n, 0, j), \qquad n, j \in \mathbb{N} \cup \{0\}.$

Then, by (2.24), $d_k(n, j)$ satisfies the following recursion: for $k, n, j \in \mathbb{N} \cup \{0\}$,

(2.27) $d_0(n, j) = \delta_{j0}, \qquad d_{k+1}(n, j) = \sum_{v=0}^{\infty} \beta_{n+j+v}\, d_k(n, v).$

More explicitly, $d_k(n, j)$ are given by, for $n, j \in \mathbb{N} \cup \{0\}$,

$d_1(n, j) = \beta_{n+j}, \qquad d_2(n, j) = \sum_{v_1=0}^{\infty} \beta_{n+j+v_1}\, \beta_{n+v_1}$

and, for $k = 3, 4, \ldots$,

$d_k(n, j) = \sum_{v_1=0}^{\infty} \cdots \sum_{v_{k-1}=0}^{\infty} \beta_{n+j+v_{k-1}}\, \beta_{n+v_{k-1}+v_{k-2}} \cdots \beta_{n+v_2+v_1}\, \beta_{n+v_1},$

the sums converging absolutely.

We put

$b_k(n, j) := b^0_k(n, j), \qquad g_k(n, j) := g^0_k(n, j)$

for $(k, n, j)$ for which the right-hand sides are defined. Then, for $n \in \mathbb{N}$ and
$j = 1, 2, \ldots, n$, we have

(2.28) $g_k(n, j) = \begin{cases} b_k(n, j), & k = 1, 3, \ldots, \\ b_k(n, n+1-j), & k = 2, 4, \ldots. \end{cases}$

By Theorems 2.5 and 2.8, we immediately obtain the following final form
of the representation of the 1-step finite predictor coefficients $\phi_{n,j}$.
Theorem 2.9. We assume either (A1) or (A2). Then, for $n \in \mathbb{N}$ and
$j = 1, \ldots, n$, we have $\phi_{n,j} = \sum_{k=1}^{\infty-} g_k(n, j)$ with (2.28) and

$b_1(n, v) = c_0 a_v, \qquad v \ge 0,$

$b_k(n, v) = c_0 \sum_{u=0}^{\infty} a_{v+u}\, d_{k-1}(n+1, u), \qquad k \ge 2,\ v \ge 0,$

the sum on the right-hand side converging absolutely.
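The series of Theorem 2.9 can be evaluated numerically once the infinite sums are truncated. The sketch below is our own illustration under stated assumptions: the MA(1) process with $h(z) = 1 + \theta z$ (so $c_0 = 1$, $c_1 = \theta$, $a_n = -(-\theta)^n$, a short memory case covered by (A1)) is chosen because all sums decay geometrically; the function name `series_phi` and the truncation parameters `n_terms` and `width` are hypothetical.

```python
def series_phi(c, a, n, n_terms=40, width=80):
    """Truncated series of Theorem 2.9 for [phi_{n,1}, ..., phi_{n,n}].

    c, a    : finite lists of MA and AR coefficients (missing entries count as 0)
    n_terms : number of terms g_1, ..., g_K kept (a truncation)
    width   : truncation level for every infinite sum over u, v
    """
    def at(seq, i):
        return seq[i] if i < len(seq) else 0.0

    # beta_m = sum_v c_v a_{v+m}, see (2.23)
    beta = [sum(cv * at(a, v + m) for v, cv in enumerate(c))
            for m in range(n + 1 + 2 * width)]
    d = [1.0] + [0.0] * (width - 1)          # d_0(n+1, u) = delta_{u0}
    phi = [0.0] * n
    for k in range(1, n_terms + 1):
        for j in range(1, n + 1):
            v = j if k % 2 == 1 else n + 1 - j          # index rule (2.28)
            # g_k(n, j) = b_k(n, v) = c_0 * sum_u a_{v+u} d_{k-1}(n+1, u)
            phi[j - 1] += c[0] * sum(at(a, v + u) * d[u] for u in range(width))
        # d_k(n+1, i) = sum_v beta_{n+1+i+v} d_{k-1}(n+1, v), see (2.27)
        d = [sum(beta[n + 1 + i + v] * d[v] for v in range(width))
             for i in range(width)]
    return phi


theta = 0.5
c = [1.0, theta]                              # h(z) = 1 + theta z
a = [-(-theta) ** i for i in range(300)]      # -1/h(z) = -sum (-theta)^i z^i
phi_1 = series_phi(c, a, 1)
phi_2 = series_phi(c, a, 2)
print(phi_1, phi_2)
```

For this process $\gamma(0) = 1 + \theta^2$, $\gamma(1) = \theta$ and $\gamma(k) = 0$ for $k \ge 2$, so the output can be compared with the solution of the $1 \times 1$ and $2 \times 2$ normal equations; for $n = 1$ the series reduces to $\theta - \theta^3 + \theta^5 - \cdots = \theta/(1+\theta^2)$.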

3. The rate of convergence of finite predictor coefficients. If the
stationary process $\{X_n\}$ is PND and satisfies (2.15), then we have the Wiener
prediction formula (2.16) with $m = 0$ or (1.4) with

(3.1) $\phi_j = c_0 a_j, \qquad j \in \mathbb{N}.$

We call $\phi_j$ the infinite predictor coefficients. It holds that

$\lim_{n\to\infty} \phi_{n,j} = \phi_j, \qquad j \in \mathbb{N}$

(see, e.g., Theorem 7.14 in [22]). In this section we investigate the rate for
long memory processes at which $\phi_{n,j}$ converges to $\phi_j$. Notice that, by (2.14),
(2.28) and (3.1), we have

(3.2) $\phi_j = b_1(n, j) = g_1(n, j), \qquad n \in \mathbb{N},\ j = 1, \ldots, n.$

Thus, $\phi_j$ is the first term of the series $\sum_{k=1}^{\infty-} g_k(n, j)$ in Theorem 2.9 expressing
$\phi_{n,j}$. This suggests the usefulness of the expression for our purpose.

Throughout this section, we assume that the stationary process $\{X_n\}$
satisfies (A2) in Section 2.3 (long memory).

For $u \ge 0$, we put

$f_1(u) := \frac{1}{\pi(1+u)}, \qquad f_2(u) := \frac{1}{\pi^2} \int_0^{\infty} \frac{ds_1}{(s_1+1)(s_1+1+u)},$

and, for $k = 3, 4, \ldots$,

$f_k(u) := \frac{1}{\pi^k} \int_0^{\infty} ds_{k-1} \cdots \int_0^{\infty} ds_1\, \frac{1}{(s_{k-1}+1)} \Bigl\{ \prod_{m=1}^{k-2} \frac{1}{(s_{m+1}+s_m+1)} \Bigr\} \frac{1}{(s_1+1+u)}$

(see Section 3 in [17]; see also Section 6 in [15]).

Lemma 3.1. (i) $\sum_{k=1}^{\infty} f_{2k}(0)\, x^{2k} = (\pi^{-1} \arcsin x)^2$ for $|x| < 1$;
(ii) $\sum_{k=1}^{\infty} f_{2k-1}(0)\, x^{2k-1} = \pi^{-1} \arcsin x$ for $|x| < 1$.
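Lemma 3.1 identifies the numbers $f_k(0)$ with Taylor coefficients, so they can be tabulated without evaluating the iterated integrals. The following sketch, our own illustration with the hypothetical helper `lemma31_coefficients`, reads $f_{2k-1}(0)$ off the series of $\pi^{-1}\arcsin x$ and $f_{2k}(0)$ off its square (a Cauchy product), and then evaluates both generating functions at $x = \sin(\pi d)$, the point that matters in Section 3.

```python
import math


def lemma31_coefficients(n_terms):
    """Return f with f[m] = f_m(0) for 1 <= m < n_terms, as given by Lemma 3.1."""
    p = [0.0] * n_terms                  # p[m]: coefficient of x^m in arcsin(x)/pi
    central = 1.0                        # binom(2k, k) / 4^k
    for k in range(n_terms // 2):
        p[2 * k + 1] = central / ((2 * k + 1) * math.pi)
        central *= (2 * k + 1) / (2 * k + 2)
    f = list(p)                          # odd indices: part (ii)
    for m in range(2, n_terms, 2):       # even indices: part (i), Cauchy product
        f[m] = sum(p[i] * p[m - i] for i in range(1, m, 2))
    return f


d = 0.3
x = math.sin(math.pi * d)
f = lemma31_coefficients(400)
odd_sum = sum(f[m] * x ** m for m in range(1, 400, 2))
even_sum = sum(f[m] * x ** m for m in range(2, 400, 2))
print(f[1], f[2], odd_sum, even_sum)
```

Since $\arcsin(\sin \pi d) = \pi d$ for $0 < d < 1/2$, the two sums should return $d$ and $d^2$; the second value is the constant that appears in Theorem 3.3. The values $f_1(0) = 1/\pi$ and $f_2(0) = 1/\pi^2$ also agree with the defining integrals.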
Proof. Let $j \ge 1$. We easily see that $f_{1+j}(u) = \int_0^{\infty} f_1(s+u) f_j(s)\,ds$ for
$u \ge 0$. Hence, we have

$f_{2+j}(0) = \int_0^{\infty} f_1(s) f_{j+1}(s)\,ds = \int_0^{\infty} ds\, f_1(s) \int_0^{\infty} f_1(u+s) f_j(u)\,du$

$\qquad = \int_0^{\infty} \Bigl\{ \int_0^{\infty} f_1(s+u) f_1(s)\,ds \Bigr\} f_j(u)\,du = \int_0^{\infty} f_2(u) f_j(u)\,du.$

Repeating this argument, we obtain

(3.3) $\int_0^{\infty} f_i(u) f_j(u)\,du = f_{i+j}(0), \qquad i, j \in \mathbb{N}.$

Thus, the assertion (i) follows from Lemma 6.5 in [15], while (ii) follows from
Lemma 3.4 in [16]. □

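Recall $d_k(n, u)$ from Section 2.3.

Proposition 3.2. (i) For $r \in (1, \infty)$, there exists $N \in \mathbb{N}$ such that

(3.4) $0 < d_k(n, u) \le \frac{f_k(0)\{r \sin(\pi d)\}^k}{n}, \qquad u \in \mathbb{N} \cup \{0\},\ k \in \mathbb{N},\ n \ge N.$

(ii) For $k \in \mathbb{N}$ and $u \in \mathbb{N} \cup \{0\}$, $d_k(n, u) \sim n^{-1} f_k(0) \sin^k(\pi d)$ as $n \to \infty$.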

Proof. Let $r > 1$. Recall $\beta_n$ from (2.23). The condition (A2) implies

(3.5) $\beta_n \sim \frac{\sin(\pi d)}{\pi}\, n^{-1}, \qquad n \to \infty$

(see Proposition 4.3 in [14]). Thus, for $n$ large enough,

$0 < \beta_{[ns]+n+u} \le \frac{r^{1/2} \sin(\pi d)}{\pi([ns]+n+u)}, \qquad s \ge 0,\ u \in \mathbb{N} \cup \{0\}.$

Since we have, for $n$ large enough,

$\frac{1}{[ns]+n+u} \le \frac{r^{1/2}}{n(s+1)}, \qquad s \ge 0,\ u \in \mathbb{N} \cup \{0\},$

there exists $N_1 \in \mathbb{N}$ such that

(3.6) $0 < \beta_{[ns]+n+u} \le \frac{r \sin(\pi d)}{\pi(s+1)}\, n^{-1}, \qquad s \ge 0,\ u \in \mathbb{N} \cup \{0\},\ n \ge N_1.$

In the same way, we can choose $N_2$ so that

(3.7) $0 < \beta_{[ns_2]+[ns_1]+n} \le \frac{r \sin(\pi d)}{\pi(s_2+s_1+1)}\, n^{-1}, \qquad s_1, s_2 \ge 0,\ n \ge N_2.$

Therefore, we have, for $n \ge N := \max(N_1, N_2)$,

$0 < d_3(n, u) = \int_0^{\infty} ds_2 \int_0^{\infty} ds_1\, \beta_{[s_2]+n}\, \beta_{[s_2]+[s_1]+n}\, \beta_{[s_1]+n+u}$

$\qquad = n^2 \int_0^{\infty} ds_2 \int_0^{\infty} ds_1\, \beta_{[ns_2]+n}\, \beta_{[ns_2]+[ns_1]+n}\, \beta_{[ns_1]+n+u}$

$\qquad \le \frac{\{r \sin(\pi d)\}^3}{n} \cdot \frac{1}{\pi^3} \int_0^{\infty} ds_2 \int_0^{\infty} ds_1\, \frac{1}{(s_2+1)(s_2+s_1+1)(s_1+1)} = \frac{\{r \sin(\pi d)\}^3}{n}\, f_3(0),$

which implies (3.4) with $k = 3$. Notice that $N$ is independent of the choice
$k = 3$. We can prove (3.4) for general $k$ and the same $N$ in a similar fashion.

We also prove (ii) only for $k = 3$; the general case can be treated in the
same way. By (3.5), we have

(3.8) $\lim_{n\to\infty} n\beta_{[ns]+n+u} = \frac{\sin(\pi d)}{\pi(s+1)}, \qquad s \ge 0,\ u \in \mathbb{N} \cup \{0\},$

(3.9) $\lim_{n\to\infty} n\beta_{[ns_2]+[ns_1]+n} = \frac{\sin(\pi d)}{\pi(s_2+s_1+1)}, \qquad s_1, s_2 \ge 0.$

By (3.6)-(3.9) and the dominated convergence theorem, we obtain

$\lim_{n\to\infty} \int_0^{\infty} ds_2 \int_0^{\infty} ds_1\; n\beta_{[ns_2]+n} \cdot n\beta_{[ns_2]+[ns_1]+n} \cdot n\beta_{[ns_1]+n+u} = \frac{\sin^3(\pi d)}{\pi^3} \int_0^{\infty} ds_2 \int_0^{\infty} ds_1\, \frac{1}{(s_2+1)(s_2+s_1+1)(s_1+1)}.$

This implies $\lim_{n\to\infty} n\,d_3(n, u) = \sin^3(\pi d)\, f_3(0)$ or (ii) with $k = 3$, as desired. □

The following theorem gives the rate for long memory processes at which
$\phi_{n,j}$ converges to $\phi_j$. It applies, in particular, to the fractional ARIMA($p, d, q$)
processes with $0 < d < 1/2$.

Theorem 3.3. We assume (A2). Then we have, for $j \in \mathbb{N}$,

$\lim_{n\to\infty} n\{\phi_{n,j} - \phi_j\} = d^2 \sum_{u=j}^{\infty} \phi_u.$
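The limit in Theorem 3.3 can be checked numerically in the fractional ARIMA($0, d, 0$) case. The sketch below is our own illustration and relies on an assumption that is not part of this paper: the closed form $\phi_{n,j} = -\binom{n}{j}\Gamma(j-d)\Gamma(n-d-j+1)/\{\Gamma(-d)\Gamma(n-d+1)\}$ for the finite predictor coefficients of this model, quoted from the literature on fractional differencing (Hosking). The infinite coefficients are $\phi_j = c_0 a_j$ with $c_0 = 1$, and $\sum_{u \ge 1} \phi_u = 1$ because $-(1-z)^d$ vanishes at $z = 1$. The function names and the values $d = 0.3$, $n = 2000$ are arbitrary.

```python
import math


def phi_finite(n, j, d):
    """Assumed closed form for phi_{n,j} of fractional ARIMA(0, d, 0), 0 < d < 1/2.
    Uses -1/Gamma(-d) = d/Gamma(1-d) and log-gamma to avoid overflow."""
    log_binom = math.lgamma(n + 1) - math.lgamma(j + 1) - math.lgamma(n - j + 1)
    log_gammas = (math.lgamma(j - d) + math.lgamma(n - d - j + 1)
                  - math.lgamma(n - d + 1))
    return d / math.gamma(1.0 - d) * math.exp(log_binom + log_gammas)


def phi_infinite(j, d):
    """phi_j = c_0 a_j = -Gamma(j-d) / (Gamma(-d) Gamma(j+1)) for this model."""
    return d / math.gamma(1.0 - d) * math.exp(math.lgamma(j - d) - math.lgamma(j + 1))


d, n = 0.3, 2000
rows = []
tail = 1.0                                # sum_{u >= 1} phi_u = 1 for this model
for j in range(1, 4):
    scaled_gap = n * (phi_finite(n, j, d) - phi_infinite(j, d))
    rows.append((j, scaled_gap, d * d * tail))   # compare with d^2 * sum_{u >= j} phi_u
    tail -= phi_infinite(j, d)
print(rows)
```

For $j = 1$ the closed form gives $\phi_{n,1} = nd/(n-d)$, so that $n\{\phi_{n,1} - \phi_1\} = nd^2/(n-d)$, which tends to $d^2$ in agreement with the theorem.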
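Proof. Let $r > 1$ be chosen so that $0 < r\sin(\pi d) < 1$. By Lemma 3.1,

(3.10) $\sum_{k=1}^{\infty} f_k(0)\{r \sin(\pi d)\}^k < \infty.$

Let $N$ be such that (3.4) holds. Then, for $n \ge N$ and $j = 1, \ldots, n$,

$\Bigl| n \sum_{u=0}^{\infty} a_{n-j+u} \sum_{k=1}^{\infty} d_{2k-1}(n, u) \Bigr| \le \sum_{u=0}^{\infty} |a_{n-j+u}| \sum_{k=1}^{\infty} n\,d_{2k-1}(n, u) \le \Bigl[ \sum_{k=1}^{\infty} f_{2k-1}(0)\{r \sin(\pi d)\}^{2k-1} \Bigr] \sum_{u=n-j}^{\infty} |a_u|,$

so that

$\lim_{n\to\infty} n \sum_{u=0}^{\infty} a_{n-j+u} \sum_{k=1}^{\infty} d_{2k-1}(n, u) = 0.$

Proposition 3.2, Lemma 3.1 and the dominated convergence theorem yield

$\lim_{n\to\infty} n \sum_{u=0}^{\infty} a_{j+u} \sum_{k=1}^{\infty} d_{2k}(n, u) = \Bigl\{ \sum_{k=1}^{\infty} f_{2k}(0) \sin^{2k}(\pi d) \Bigr\} \sum_{u=0}^{\infty} a_{j+u} = d^2 \sum_{u=j}^{\infty} a_u.$

Therefore, by Theorem 2.9 and (3.2) we have

$\lim_{n\to\infty} n\{\phi_{n-1,j} - \phi_j\} = \lim_{n\to\infty} \Bigl\{ n \sum_{k=1}^{\infty} b_{2k+1}(n-1, j) + n \sum_{k=1}^{\infty} b_{2k}(n-1, n-j) \Bigr\}$

$\qquad = \lim_{n\to\infty} c_0\, n \sum_{u=0}^{\infty} a_{j+u} \sum_{k=1}^{\infty} d_{2k}(n, u) + \lim_{n\to\infty} c_0\, n \sum_{u=0}^{\infty} a_{n-j+u} \sum_{k=1}^{\infty} d_{2k-1}(n, u)$

$\qquad = c_0 d^2 \sum_{u=j}^{\infty} a_u = d^2 \sum_{u=j}^{\infty} \phi_u,$

as desired. □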

The next proposition, which follows from the proof of Theorem 3.3, shows
that, under (A2), the sum $\sum_{k=1}^{\infty} g_k(n, j)$ converges absolutely for $n$ large
enough and $j = 1, \ldots, n$.

Proposition 3.4. We assume (A2). If $N$ is such that (3.4) holds for
some $r \in (1, \infty)$, then $\sum_{k=1}^{\infty} |g_k(n, j)| < \infty$ for $n \ge N$ and $j = 1, \ldots, n$.

For processes with short memory, we have the following result.

Proposition 3.5. We assume (A1). If $N$ is such that
$(\sum_{j=0}^{\infty} |c_j|) \times (\sum_{k=N+1}^{\infty} |a_k|) < 1$, then $\sum_{k=1}^{\infty} |g_k(n, j)| < \infty$ for $n \ge N$ and $j = 1, \ldots, n$.

We omit the proof.

4. Baxter's inequality for long memory processes. In this section we
prove Baxter's inequality (1.5).

Theorem 4.1. We assume (A2). Then there exists a positive constant
$M$ such that (1.5) holds for all $n \in \mathbb{N}$.
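As a numerical illustration of (1.5), one can tabulate the ratio of its two sides in the fractional ARIMA($0, d, 0$) case. The sketch below is ours and rests on an assumption not made in this paper: the closed form $\phi_{n,1} = nd/(n-d)$ together with the ratio $\phi_{n,j}/\phi_{n,j-1} = (n-j+1)(j-1-d)/\{j(n-d-j+1)\}$, which follow from the formula for the finite predictor coefficients of this model in the literature (Hosking). The infinite coefficients satisfy $\phi_1 = d$ and $\phi_j = \phi_{j-1}(j-1-d)/j$, are positive, and sum to 1. The function name `baxter_ratio` is hypothetical.

```python
def baxter_ratio(n, d):
    """Ratio sum_{j<=n} |phi_{n,j} - phi_j| / sum_{k>n} |phi_k| for
    fractional ARIMA(0, d, 0), 0 < d < 1/2 (closed forms assumed, see lead-in)."""
    phi_inf = [d]                         # phi_1
    phi_fin = [n * d / (n - d)]           # phi_{n,1}
    for j in range(2, n + 1):
        phi_inf.append(phi_inf[-1] * (j - 1 - d) / j)
        phi_fin.append(phi_fin[-1] * (n - j + 1) * (j - 1 - d)
                       / (j * (n - d - j + 1)))
    lhs = sum(abs(pf - pi) for pf, pi in zip(phi_fin, phi_inf))
    rhs = 1.0 - sum(phi_inf)              # tail sum, using sum_{k>=1} phi_k = 1
    return lhs / rhs


ratios = {n: baxter_ratio(n, 0.3) for n in (1, 10, 100, 1000)}
print(ratios)
```

The ratios stay bounded as $n$ grows, which is the content of Theorem 4.1 for this model; any constant $M$ read off from such a table is only indicative, since the theorem asserts existence and not a specific value.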

Proof. Let $r > 1$ be chosen so that $0 < r\sin(\pi d) < 1$. Then we have
(3.10). By Proposition 3.2 and (2.20), we may take a positive integer $N$
such that both (3.4) and $a_n > 0$ hold for $n \ge N$. Pick $\delta \in (0, d)$. By (2.20)
and [4], Theorem 1.5.6(iii) (Potter-type bounds), we may assume that

(4.1) $a_m / a_n \le 2 \max\{(n/m)^{1+d-\delta}, (n/m)^{1+d+\delta}\}, \qquad m, n \ge N.$

By Theorem 2.9 and (3.2), we have, for $n \ge N + 3$,

$\sum_{j=1}^{n-1} |\phi_{n-1,j} - \phi_j| \le c_0 \sum_{j=1}^{n-1} \sum_{u=0}^{\infty} |a_{u+j}| \sum_{k=1}^{\infty} d_{2k}(n, u) + c_0 \sum_{j=1}^{n-1} \sum_{u=0}^{\infty} |a_{u+n-j}| \sum_{k=1}^{\infty} d_{2k-1}(n, u)$

$\qquad = c_0 \sum_{k=1}^{\infty} \sum_{u=0}^{\infty} d_k(n, u) \sum_{j=1}^{n-1} |a_{u+j}| = c_0 \{G_1(n) + G_2(n)\},$

where

$G_1(n) := \sum_{k=1}^{\infty} \sum_{u=0}^{\infty} d_k(n, u) \sum_{j=1}^{N+1} |a_{u+j}|,$

$G_2(n) := \sum_{k=1}^{\infty} \sum_{u=0}^{\infty} d_k(n, u) \sum_{j=N+2}^{n-1} a_{u+j}.$
k=lu=0 j=N+2 
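For $n \ge N + 3$, we have

$G_1(n) \le n^{-1} \Bigl[ \sum_{k=1}^{\infty} f_k(0)\{r \sin(\pi d)\}^k \Bigr] \sum_{j=1}^{N+1} \sum_{u=0}^{\infty} |a_{u+j}|$

and

$G_2(n) \le \sum_{k=1}^{\infty} \int_0^{\infty} du\; d_k(n, [u]) \int_N^{n} a_{[u]+[s]+2}\,ds = n a_n \sum_{k=1}^{\infty} \int_0^{\infty} du\; n\,d_k(n, [nu]) \int_{N/n}^{1} \frac{a_{[nu]+[ns]+2}}{a_n}\,ds.$

By (4.1), we have, for $u \ge 0$, $n \ge N + 3$ and $N/n \le s \le 1$,

$\frac{a_{[nu]+[ns]+2}}{a_n} \le 2 \max\Bigl\{ \Bigl( \frac{n}{[nu]+[ns]+2} \Bigr)^{1+d-\delta}, \Bigl( \frac{n}{[nu]+[ns]+2} \Bigr)^{1+d+\delta} \Bigr\} \le 2 \max\{(u+s)^{-(1+d+\delta)}, (u+s)^{-(1+d-\delta)}\}.$

Hence, by (3.4),

$G_2(n) \le 2 \int_0^{1} ds \int_0^{\infty} \{(u+s)^{-(1+d+\delta)} + (u+s)^{-(1+d-\delta)}\}\,du \times n a_n \Bigl[ \sum_{k=1}^{\infty} f_k(0)\{r \sin(\pi d)\}^k \Bigr]$

$\qquad = 2 \Bigl\{ \frac{1}{(\delta+d)(1-d-\delta)} + \frac{1}{(d-\delta)(1-d+\delta)} \Bigr\} \times n a_n \Bigl[ \sum_{k=1}^{\infty} f_k(0)\{r \sin(\pi d)\}^k \Bigr].$

Combining these estimates, we obtain

$\limsup_{n\to\infty} \Bigl\{ n^{d}\,\ell(n) \sum_{j=1}^{n-1} |\phi_{n-1,j} - \phi_j| \Bigr\} < \infty.$

Since $\sum_{k=n}^{\infty} \phi_k = c_0 \sum_{k=n}^{\infty} a_k \sim c_0 \sin(\pi d)/\{\pi n^{d}\,\ell(n)\}$ as $n \to \infty$, the theorem
follows. □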